De-duplicating backup tool on a block basis? [closed]

Posted by SST on Server Fault
Published on 2012-10-11T14:48:23Z Indexed on 2012/10/11 15:38 UTC

Filed under: backup | rsync | largefiles | deduplication

I am looking for a backup tool (ideally free as in speech or beer) for Unix-like operating systems that can store deduplicated backups, i.e. only non-redundant content takes up additional space.

I have already looked at dirvish (my first candidate) and rsnapshot, which use hardlinks to achieve deduplication at the per-file level. However, I want to back up large files (Thunderbird mailboxes >3GB, VMware images >10GB), and such files are stored again in their entirety even if just a few bytes change.

Then there are rsync-based tools like rdiff-backup, which store only deltas and a current mirror. However, as the deltas are generated against each previous mirror, it is difficult to fine-tune the retention granularity (keep only one backup after a week, etc.) because the deltas would have to be re-evaluated.

Another approach is to partition content into blocks and store each block only if it is not stored yet, otherwise just linking it to the first occurrence. The only tool I know of so far that does this is obnam (http://liw.fi/obnam), and it even supports zlib compression and GPG encryption -- nice! But it is very slow, AFAICT.
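To make the block-based idea concrete, here is a minimal in-memory sketch in Python. It is not how obnam or any other tool is implemented. The class name `BlockStore`, its methods, and the fixed `BLOCK_SIZE` are all hypothetical, chosen only to illustrate "store each block once, reference it afterwards" and why generations can then be deleted independently.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


class BlockStore:
    """Toy content-addressed block store (an assumption, not a real tool's API)."""

    def __init__(self):
        self.blocks = {}       # SHA-256 hex digest -> block bytes
        self.generations = {}  # generation name -> ordered list of digests

    def backup(self, name, data):
        """Store data as generation `name`; return how many new blocks were added."""
        digests = []
        new_blocks = 0
        for offset in range(0, len(data), BLOCK_SIZE):
            block = bytes(data[offset:offset + BLOCK_SIZE])
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # Block content not seen before: store it once.
                self.blocks[digest] = block
                new_blocks += 1
            # Known or new, the generation only records a reference.
            digests.append(digest)
        self.generations[name] = digests
        return new_blocks

    def restore(self, name):
        """Reassemble a generation from its block references."""
        return b"".join(self.blocks[d] for d in self.generations[name])

    def forget(self, name):
        """Delete one generation and drop blocks no other generation references."""
        del self.generations[name]
        live = {d for refs in self.generations.values() for d in refs}
        for digest in list(self.blocks):
            if digest not in live:
                del self.blocks[digest]


# Usage: a "large file" of 8 distinct blocks, then the same file with 4 bytes changed.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
modified = bytearray(original)
modified[5000:5004] = b"edit"  # touches only the second block

store = BlockStore()
print(store.backup("gen1", original))   # number of blocks stored for the first backup
print(store.backup("gen2", modified))   # number of additional blocks for the second
store.forget("gen1")                    # gen2 stays restorable on its own
print(store.restore("gen2") == bytes(modified))
```

In this sketch the second backup only adds the one block that changed, and any generation can be removed without recomputing deltas for the others. A fixed block size is the simplest choice but copes badly with insertions that shift all later data, so a real tool would likely need content-defined chunk boundaries, persistent storage, and an index that scales beyond a Python dictionary.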

Does anyone know of any other solid backup software that supports deduplication at the sub-file level, ideally with at least some management options (show/select/delete generations...)?

© Server Fault or respective owner
